ABSTRACT


We present a new technique, called $T_1$ scaling, for
determining scale estimates from paired comparisons data. We
present the new method in conjunction with a sensitivity
diagnostic that ascertains the extent to which intransitive
elements in the data influence the scale estimates from the
Thurstonian judgment scaling model. The $T_1$ scale estimates,
based upon the minimization of absolute deviations rather than
least squares, are relatively insensitive to the presence of
limited inconsistency. We apply the new solution technique,
shown to be a straightforward minimum cost network flow problem,
to several scaling problems in the literature. When no single
limited source of inconsistency is indicated, the scale estimates
thus obtained are consistent with the least squares estimates.
When isolated departures from the scaling model or possible data
errors are present, the $T_1$ procedure remains largely insensitive
to their presence, preserving the interval scale properties of
the estimates.
Introduction


This paper presents a new solution technique, called $T_1$
scaling, for determining scale estimates from paired comparisons
data. The development of the $T_1$ procedure was motivated by a
concern for the substantial influence of intransitive elements in
the data on the final solution produced by the least squares
approach of the Thurstonian judgment scaling model (Thurstone,
1927a). The new technique utilizes a discrete $L_1$ linear
approximation based upon the minimization of absolute deviations
(see Barrodale, 1968, and Barrodale and Roberts, 1973), where the
special structure of the scaling problem allows it to be solved
quite efficiently as a minimum cost network flow problem, using
standard techniques presented in such sources as Bradley, Hax,
and Magnanti (1977) or Shapiro (1979). The scale estimates thus
provided are in some sense more "robust" than in the traditional
approach in that they discount the influence of limited
inconsistency in the data.

The balance of the paper follows in several sections.
The first section reviews the least squares solution technique
for the Thurstonian judgment scaling model. The second section
reviews Mosteller's goodness-of-fit measure for the least squares
estimates (Mosteller, 1951), and demonstrates how seriously this
fit deteriorates in the presence of limited inconsistency. In
order to do this, we develop a sensitivity diagnostic to assess
the relative influence of each pair of items on the determination
of the least squares estimates. The third section presents the
robust $T_1$ scaling technique and shows that obtaining these scale
estimates is equivalent to solving a minimum cost network flow
problem. Finally, we apply the $T_1$ approach to several problems
from the literature, and compare the results to the least squares
scale estimates.


I. The Law of Comparative Judgment and the Thurstonian Judgment
Scaling Model

In the typical judgment scaling problem, we are presented
with k different objects, each exhibiting some degree of a
certain common characteristic. If this characteristic, such as
"height", "weight", or "age", is a directly measurable quality
of singular dimensionality, then we can order these k objects by
placing them along a continuum at the measured value of their
common characteristic. The positions of these objects, or scale
values, have the properties of the measurement scale. For
example, objects ordered on the basis of height or weight have
scale values with ratio properties, while objects ordered on the
basis of heat in degrees Fahrenheit have interval properties.

When the objects share a common characteristic that is
not directly measurable, such as "beauty" or "softness", the
ordering of the objects must depend upon some subjective estimate
of the common characteristic exhibited by each object. In order
to facilitate the process of ordering the objects along a
continuum without an apparent scale, the method of paired
comparisons is used to exact a set of relative judgments from an
observer. Thus, for any given pair of objects the observer is
required only to judge which of the two exceeds the other with
respect to the underlying characteristic. This set of pairwise
judgments is used to determine scale values with interval
properties.

The law of comparative judgment established the
theoretical foundations for Thurstone's judgment scaling model.
Each object, when presented to the observer, acts as a stimulus
which excites a certain discriminal process within the observer.
Due to changing conditions in the experimental situation or
fluctuations within the observer, the same stimulus might trigger
a slightly different process, such that the position of the
stimulus on the specific psychological continuum is not always
the same. For example, an observer's subjective estimate of the
"beauty" of an object might be different when presented with the
object a second time, on account of the observer's mood, the time
of day, or the temperature of his surroundings.

In the Thurstonian model, the distribution of these
subjective estimates along the continuum is postulated to be
normal. The standard deviation of this distribution is called
the discriminal dispersion of the stimulus, and the mean is taken
to be the true scale value. The distributions of two stimuli, i
and j, might thus be represented as in Figure I.1. The scale
values are $s_i$ and $s_j$, and the discriminal dispersions are the
standard deviations, $\sigma_i$ and $\sigma_j$. The discriminal
processes within the observer are random variables denoted $d_i$
and $d_j$.

It is now possible to talk about a discriminal
difference, $(d_j - d_i)$, for any pair of stimuli i and j. If i and
j are presented to an observer a large number of times, the
discriminal differences will also form a normal distribution,
with standard deviation
$\sigma_d = (\sigma_i^2 + \sigma_j^2 - 2 r_{ij} \sigma_i \sigma_j)^{1/2}$,
where $r_{ij}$ is the correlation between the discriminal processes
associated with i and j.


FIGURE I.1

The discriminal distributions for stimuli i and j,
centered about the true scale values $s_i$ and $s_j$.


The observer, of course, is unable to assign a value to
the position of the stimulus along the appropriate psychological
continuum, but when presented with two stimuli, he is able to
judge which of the two is greater. In some cases, because the
distributions overlap, the observer may judge stimulus i to be
greater than j even though $s_j$ is actually greater than $s_i$. Over
a large number of comparisons, it is possible to determine the
approximate proportions of times stimulus j is judged greater
than i. These proportions are then used to determine the
relative positions of $s_i$ and $s_j$ on the continuum measuring their
common quality.

Figure I.2 shows the distribution of the discriminal
difference between i and j, where the shaded portion of the curve
indicates the proportion of times stimulus j appears greater than
stimulus i to the observer. The value $x_{ij}$ is the difference
between $s_i$ and $s_j$ measured in $\sigma_d$ units. Hence,
$s_j - s_i = x_{ij} \sigma_d$, or in its final form:

(1)    $s_j - s_i = x_{ij} (\sigma_i^2 + \sigma_j^2 - 2 r_{ij} \sigma_i \sigma_j)^{1/2}$


FIGURE I.2

The discriminal difference between stimuli i and j.


In this form, without limiting assumptions, the law of
comparative judgment is not solvable, as there are many more
unknowns than equations. Thurstone presented five cases of the
law, introducing certain assumptions into the model to make it
tractable. His case V is the most restrictive, assuming constant
standard deviations for all of the discriminal dispersions, and
no correlation between any of the discriminal processes (implying
zero covariance between stimuli). Mosteller (1951) later showed
that the assumption of equal correlations between processes leads
to a formulation equivalent to Thurstone's case V. In either
case, the unit of measurement for the psychological scale may be
determined arbitrarily; hence, the constant modifying the $x_{ij}$
term in (1) may be taken to be unity, leaving

(2)    $s_j - s_i = x_{ij}$

The law of comparative judgment, limited by assumptions
of equal dispersions and equal covariances, is most frequently
estimated using paired comparisons data. In this procedure, we
present each pair of stimuli to the observer a large number of
times, as described above. If we wish to examine the collective
discriminal process of an entire population, we present each
stimulus pair i,j to each individual in the population only once.
Thurstone, in his case II of the law of comparative judgment,
showed that the same formulation holds true for either approach,
under certain assumptions of homogeneity.

The observed proportion of times stimulus j exceeds i,
$p'_{ij}$, forms the matrix P'. Matrix P' has the property that
symmetric cells must sum to one; hence, $p'_{ij} + p'_{ji} = 1$. Matrix
P' determines matrix X', where each element $x'_{ij}$ is the unit
normal deviate corresponding to the observed proportion $p'_{ij}$. If
the range of stimuli along the psychological continuum is large
relative to the discriminal dispersion, there may in fact be
cases where one stimulus is never judged greater than another.
If stimulus i is judged less than j every time the pair is
presented, the value $p'_{ij}$ will equal one and the value $x'_{ij}$ will
approach infinity. This problem of an incomplete X' matrix is
usually solved by establishing lower and upper bounds on $p'_{ij}$ of
.01 and .99, thus ensuring stability of the resulting scale
estimates.
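The conversion from P' to X' can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `normal_deviates` and the three-stimulus matrix `P_obs` are hypothetical, and the bounds of .01 and .99 are the ones mentioned above.

```python
from statistics import NormalDist


def normal_deviates(P, lower=0.01, upper=0.99):
    """Convert observed proportions p'_ij into unit normal deviates x'_ij.

    Proportions are clipped to [lower, upper] so that a stimulus that is
    always (or never) judged greater does not yield an infinite deviate.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    k = len(P)
    X = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                continue  # a stimulus is not compared with itself
            p = min(max(P[i][j], lower), upper)
            X[i][j] = std_normal.inv_cdf(p)
    return X


# Hypothetical data: P_obs[i][j] is the proportion of times j exceeded i.
P_obs = [
    [0.5, 0.7, 1.0],
    [0.3, 0.5, 0.8],
    [0.0, 0.2, 0.5],
]
X_obs = normal_deviates(P_obs)
```

Because symmetric cells of P' sum to one and the two bounds are symmetric about .5, the resulting X' stays skew symmetric even after clipping.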

From (2) above, we see that the difference between the
estimates of any two scale values, $s'_i$ and $s'_j$, gives us $x''_{ij}$,
an estimate of the observed value $x'_{ij}$, as shown:

(3)    $s'_j - s'_i = x''_{ij}$

With errorless data, we can choose scale estimates $s'_i$ and $s'_j$ so
that the estimates $x''_{ij}$ will equal the observed $x'_{ij}$. Typically,
differences between the observed proportions and the true values
lead to a difference between $x''_{ij}$ and $x'_{ij}$, no matter how we
choose $s'_i$ and $s'_j$. Thurstone chose his scale estimates to
minimize Q, the sum of the squared deviations between $x''_{ij}$ and
$x'_{ij}$:

(4)    $Q = \sum_i \sum_j (x'_{ij} - x''_{ij})^2$

Substituting (3) into the equation above:

(5)    $Q = \sum_i \sum_j (x'_{ij} - s'_j + s'_i)^2$

Equation (5) is equivalent to minimizing either row sums or
column sums, so Thurstone limited his analysis to the columns of
X'.

Differentiating Q with respect to $s'_j$ gives:

(6)    $\frac{\partial Q}{\partial s'_j} = -2 \sum_i (x'_{ij} - s'_j + s'_i)$

Setting the partial derivative to zero and solving:

(7)    $s'_j = \frac{1}{k} \sum_i x'_{ij} + \frac{1}{k} \sum_i s'_i$


where k is the number of objects in the scaling problem. The
rightmost term in (7) is simply the mean of the estimated scale
values. Because the origin of the psychological continuum is
arbitrary, we can take it to be the mean of the $s'_i$, giving:

(8)    $s'_j = \frac{1}{k} \sum_i x'_{ij}$

Thus, the least squares estimates of the true scale values are
the column means of the matrix X'. Torgerson (1958) presents a
more detailed discussion of this derivation.
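Equation (8) translates directly into code. The sketch below is illustrative only: `least_squares_scale` is a hypothetical helper name, and the three scale values are made-up, errorless data chosen so that their mean is zero.

```python
def least_squares_scale(X):
    """Least squares scale estimates: the column means of X' (equation 8)."""
    k = len(X)
    return [sum(X[i][j] for i in range(k)) / k for j in range(k)]


# Hypothetical errorless data built from known scale values with mean zero,
# using x'_ij = s_j - s_i as in equation (2).
true_s = [-0.3, 0.0, 0.3]
X = [[s_j - s_i for s_j in true_s] for s_i in true_s]

estimates = least_squares_scale(X)
```

With errorless data of this kind the column means reproduce the centered scale values, which is the behaviour the derivation predicts.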
II. The Effects of Inconsistency in the Observed Data on the
Determination of the Thurstonian Scale Estimates


The method of paired comparisons does not always produce a
set of data appropriate for use in a scaling model such as
Thurstone's. If there is perfect agreement among the judges on
the ordering of the k objects being compared, then it is not
possible to determine scale estimates with interval
characteristics. Another problem occurs when an observer or some
observers are particularly bad judges, or are poorly motivated to
take the care required to produce consistent comparisons. A
third problem occurs if the experimenter asks too much of his
observers; the objects may be so close together with respect to
their common quality that distinguishing them becomes almost a
guessing game. Finally, it is possible that the quality common
to the objects under examination is not representable as a linear
variate. When any one or several of these difficulties is
present, the reported preferences may contain intransitivities
called circular triads, where object i is judged greater than
object j, j is judged greater than k, yet k is ultimately judged
greater than i. Such an ordering is impossible to represent on a
single dimensional scale, and thus interferes with the process of
determining scale estimates.

Kendall and Babington Smith (1940) observed that
"[Thurstone's] method is appropriate where one is entitled to
assume a priori or by reason of precautions taken in the
selection of material that a linear variable is involved and that
there exist perceptible differences between the items presented
for comparison," (p. 342). They proposed a coefficient of
consistence, $\zeta$, where

    $\zeta = \begin{cases} 1 - 24d/(k^3 - k), & k \text{ odd} \\ 1 - 24d/(k^3 - 4k), & k \text{ even,} \end{cases}$


where k is the number of objects and where d is the number of
circular triads reported. The coefficient equals one when the
comparisons data contain no inconsistencies and equals zero when
the maximum number of circular triads is present. Thus, a value
of $\zeta$ near zero indicates potentially troublesome departures from
the scaling model.
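As an illustration, the coefficient can be computed by counting circular triads directly. The sketch below assumes a single observer's complete set of judgments stored in a Boolean matrix; the names `count_circular_triads`, `coefficient_of_consistence`, and the two small example matrices are hypothetical.

```python
from itertools import combinations


def count_circular_triads(prefers):
    """Count circular triads; prefers[i][j] is True when i is judged > j."""
    d = 0
    for a, b, c in combinations(range(len(prefers)), 3):
        # The three judgments within a triad can cycle in two directions.
        forward = prefers[a][b] and prefers[b][c] and prefers[c][a]
        backward = prefers[b][a] and prefers[c][b] and prefers[a][c]
        if forward or backward:
            d += 1
    return d


def coefficient_of_consistence(prefers):
    """Kendall and Babington Smith's coefficient for k objects."""
    k = len(prefers)
    d = count_circular_triads(prefers)
    denominator = k**3 - k if k % 2 == 1 else k**3 - 4 * k
    return 1 - 24 * d / denominator


# Hypothetical judgments on three objects: a transitive and a circular set.
transitive = [[False, True, True], [False, False, True], [False, False, False]]
circular = [[False, True, False], [False, False, True], [True, False, False]]

zeta_transitive = coefficient_of_consistence(transitive)
zeta_circular = coefficient_of_consistence(circular)
```

For three objects a single circular triad is the maximum possible, so the circular example sits at the lower end of the coefficient's range.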

For  paired  comparisons  with  fewer  than  eight  objects, 
Kendall  and  Smith  also  calculated  the  probabilities  that  a  number 
of  circular  triads  d  or  greater  would  occur  under  a  completely 
random  ranking  scheme.   If  a  single  observer  reports  a  number  of 
circular  triads  d  that  is  likely  to  have  come  from  a  process  of 
unsystematic  (random)  judgment,  his  ability  to  discriminate 
between  objects  should  be  questioned;   if  a  number  of  observers 
do  the  same,  then  a  problem  may  lie  in  the  difficulty  of  the  task 
or  in  the  dimensionality  of  the  quality  under  judgment. 

Even when paired comparisons data are free of complete
intransitivity, there is usually some form of inconsistency
present. In Figure II.1 below, three stimuli a, b, and c are
shown equally spaced along the appropriate psychological
continuum. By assumption, their discriminal dispersions are all
equal (Thurstone's case V), and because the origin for the scale
is arbitrary it has been placed at the middle scale value. If we
presented an observer with stimulus pair a,b and stimulus pair
b,c a total of n times each, it is unlikely (due to statistical
fluctuation) that the observer would report a>b exactly the same
number of times he reported b>c. Even so, while an observer may
judge a>b and b>c approximately the same number of times each, he
might judge a>c only a slightly higher number of times, not
necessarily consistent with placing c twice as far from a as from
b.


FIGURE II.1

True underlying model for the three stimulus example
demonstrating that inconsistency need not take the form of
intransitivity.
With Thurstonian scale estimates, stimulus c is
positioned on the scale somewhat closer to b than suggested by
the proportion of times b>c, yet not so close to a as suggested
by the proportion of times a>c. This "compromise" leads to
discrepancies between the observed values $x'_{ij}$ and the values
$x''_{ij}$ derived from the scale estimates, leaving a question as to
the fit of the final result.

With fallible data, it is helpful to have a measure of
the goodness-of-fit of the least squares estimates. Mosteller
(1951) presented a chi-square significance test for the fit
between the observed proportions $p'_{ij}$ and the fitted proportions
$p''_{ij}$. These fitted proportions are derived from the scale
estimates; they represent the proportion of the time stimulus i
would be judged greater than stimulus j if the true scale values
were actually $s'_i$ and $s'_j$. We can use the unit normal table to
find the proportion $p''_{ij}$ corresponding to each $x''_{ij}$, and form the
matrix of fitted proportions P''.

Mosteller suggested the arcsin transformation developed
by R. A. Fisher to establish a chi-square testing criterion.
Given proportions $p'_{ij}$ and $p''_{ij}$ from a binomial sample of size n,

    $\theta'_{ij} = \arcsin \sqrt{p'_{ij}}$    and    $\theta''_{ij} = \arcsin \sqrt{p''_{ij}}$

are distributed with variance $821/n$ when $\theta'_{ij}$ and $\theta''_{ij}$
are expressed in degrees. Thus, Mosteller suggests the following
test of the goodness-of-fit of the estimates:


    $\chi^2 = \sum_{i>j} \frac{(\theta''_{ij} - \theta'_{ij})^2}{821/n}$

where n is the total number of times each stimulus pair is
presented. The test covers the elements in the lower triangular
matrix; thus, for a scaling problem involving k stimuli, the
distribution is $\chi^2$ with $(k-1)(k-2)/2$ degrees of freedom.
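The test statistic can be sketched as follows. This is an illustration rather than Mosteller's own computational layout: the function names and the small proportion matrices are hypothetical, and only the statistic and its degrees of freedom are computed (a tail probability would require a chi-square table or an additional library).

```python
import math


def arcsine_degrees(p):
    """Arcsine transform of a proportion, expressed in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))


def mosteller_chi_square(P_obs, P_fit, n):
    """Return (statistic, degrees of freedom) over the lower triangle."""
    k = len(P_obs)
    statistic = 0.0
    for i in range(k):
        for j in range(i):  # cells with i > j
            diff = arcsine_degrees(P_fit[i][j]) - arcsine_degrees(P_obs[i][j])
            statistic += diff**2 / (821.0 / n)
    degrees_of_freedom = (k - 1) * (k - 2) // 2
    return statistic, degrees_of_freedom


# Hypothetical three-stimulus proportions; n is an assumed sample size.
P_observed = [[0.5, 0.6, 0.8], [0.4, 0.5, 0.7], [0.2, 0.3, 0.5]]
P_fitted = [[0.5, 0.62, 0.76], [0.38, 0.5, 0.66], [0.24, 0.34, 0.5]]

perfect = mosteller_chi_square(P_observed, P_observed, n=100)
imperfect = mosteller_chi_square(P_observed, P_fitted, n=100)
```

A perfect fit gives a statistic of zero, and any discrepancy between fitted and observed proportions increases it in proportion to n.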

It is possible to assess the nature of the effect of
inconsistency on the fit of the scale estimates by constructing a
situation in which a single circular triad is exhibited in
otherwise errorless comparisons data. Figure II.2 shows the
placement of four stimuli, a, b, d, and e, along the
psychological continuum. The differences between these actual
scale values are shown in the matrix X' in Table II.1. The fifth
stimulus, c, is represented at two positions on the scale. With
respect to all stimuli but a, c is positioned at -.10 on the
scale (the true value, $c_{bde}$). With respect to a, however, c is
positioned at +.15 on the continuum ($c_a$). The result is a single
inaccurate observation for stimulus pair a,c, forming the single
circular triad (a>b, b>c, c>a).

FIGURE II.2

Contrived five-stimulus example, where inaccuracy is introduced
into the observation between stimuli a and c.


           a       b       c       d       e

   a       -     -.10     .10     .15    -.20
   b      .10      -     -.05     .25    -.10
   c     -.10     .05      -      .30    -.05
   d     -.15    -.25    -.30      -     -.35
   e      .20     .10     .05     .35      -

  mean    .01    -.04    -.04     .21    -.14

TABLE II.1

Matrix X' for the contrived five-stimulus example of Figure II.2.
The final row gives the column means.


If the observed proportion of times that c was judged
greater than a were overlooked, perhaps discounted as a
transcription error, the remaining data in X' would be consistent
with the determination of scale estimates leading to a perfect
fit between P'' and P'. Including the inconsistency produces the
distorted scale estimates shown in Figure II.3. Only the scale
estimates for stimuli a and c are different from the true
underlying values; stimuli b, d, and e have the same relative
positions as in the actual configuration. Because X' is a skew
symmetric matrix ($x'_{ac} = -x'_{ca}$) and the only columns affected by
the presence of intransitivity are those for a and c, the
estimated scale values differ from the actual values by the same
amount in opposite directions. As shown in Figure II.3, the
scale estimate for stimulus c is .05 units greater than its
actual value; for a, it is .05 units less.


FIGURE II.3

Thurstonian scale estimates for the five-stimulus example
compared to the true values for stimuli a and c.


The distortion introduced into the scale due to the
inaccurate comparison of a and c degrades the fit of the least
squares estimates noticeably. In this example, where a single
circular triad is introduced into otherwise errorless data, the
fitted proportions differ from the observed in seven out of 10
cases, as shown in Table II.2. Because the least squares
procedure operates to minimize the sum of squared deviations, a
solution resulting in several small discrepancies is preferred to
a solution with a single large one. Thus, the least squares
procedure distorts the interval properties of several scale
estimates in order to compensate for a single potentially
problematic observation.


                    a        b        c        d

   b   observed    .54
       fitted      .516

   c   observed    .46      .54
       fitted      .532     .516

   d   observed    .44      .401     .353
       fitted      .417     .401     .386

   e   observed    .56      .52      .48      .618
       fitted      .536     .52      .504     .618

TABLE II.2

Comparison of observed and fitted proportions for the five-stimulus
example, showing discrepancy in seven out of 10 cells in the
lower triangular matrix.
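The count of affected cells can be checked numerically on the deviate scale. The sketch below is only an illustration: it takes the entries of X' from Table II.1 (with $x'_{ji} = -x'_{ij}$), forms the least squares estimates as column means, and lists the lower-triangle cells in which the fitted deviate $x''_{ij} = s'_j - s'_i$ departs from the observed one. The variable names and the tolerance are assumptions.

```python
labels = "abcde"

# Matrix X' for the contrived example; row i, column j holds x'_ij.
X_obs = [
    [0.00, -0.10, 0.10, 0.15, -0.20],
    [0.10, 0.00, -0.05, 0.25, -0.10],
    [-0.10, 0.05, 0.00, 0.30, -0.05],
    [-0.15, -0.25, -0.30, 0.00, -0.35],
    [0.20, 0.10, 0.05, 0.35, 0.00],
]
k = len(X_obs)

# Least squares scale estimates: column means of X'.
s_est = [sum(X_obs[i][j] for i in range(k)) / k for j in range(k)]

# Fitted deviates implied by the estimates.
X_fit = [[s_est[j] - s_est[i] for j in range(k)] for i in range(k)]

# Lower-triangle cells where fitted and observed deviates disagree.
changed = [
    (labels[i], labels[j])
    for i in range(k)
    for j in range(i)
    if abs(X_fit[i][j] - X_obs[i][j]) > 1e-9
]
```

Every cell that involves stimulus a or stimulus c is disturbed, while the cells among b, d, and e are fitted exactly.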
This small ad hoc analysis of limited intransitivity
motivates the design of a more general procedure. It would be
advantageous to have a diagnostic to determine the influence of
any one observation on the overall fit of the model. By
replacing the value in each cell of the lower triangle of the
matrix X' by a value determined from the other relative
comparisons, and then assessing the fit for these modified
values, it is possible to determine the improvement in fit
associated with the "discounting" of one observation. If this
improvement is substantial, it indicates that the interval
properties of the initial scale may have been degraded by
inconsistency. If the inconsistency can be traced to data
transcription error, or to some other uncontrolled influence
operating on a limited portion of the data, we might want to turn
to a more robust scaling procedure where outlying observations
have less influence and the discrepancy in fit is limited to as
few stimulus pairs as possible.

Consider the contrived five-stimulus example shown in
Table II.1. In this case, we introduced an intransitivity by
perturbing the observed value $x'_{ac}$. If we could somehow discount
this observation, so that scale estimates $s'_a$ and $s'_c$ depended
only on the relative comparison with stimuli b, d, and e, the
resulting value would reflect the proportion of times stimulus a
was reported greater than c, with c positioned accurately. We
can thus use the concept of adjusting a stimulus pair to design a
diagnostic technique for determining the sensitivity of paired
comparisons data to inconsistency. This notion of sensitivity is
similar to the one introduced by Hoaglin and Welsch (1978), based
loosely on the influence of a single observation on the fit of
the entire model rather than simply its own fitted value.

If stimulus pair i,j is discounted, only one value in
each of column i and column j of matrix X' changes; therefore,
all other scale estimates $s'_m$, $m \neq i$ or $j$, remain unaltered. We
can use these k-2 unaffected scale estimates to adjust the values
$s'_i$ and $s'_j$. The adjusted estimate $\hat{s}'_i$ will reflect the best
position for stimulus i relative to all other stimuli but j.
Similarly, $\hat{s}'_j$ will reflect the best position for stimulus j
without considering direct comparison to stimulus i.

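The following sets of equations determine the adjusted
scale estimates $\hat{s}'_i$ and $\hat{s}'_j$:

$$
\begin{array}{ll}
\hat{s}'_i - s'_1 = x'_{1,i} & \hat{s}'_j - s'_1 = x'_{1,j} \\
\quad\vdots & \quad\vdots \\
\hat{s}'_i - s'_{i-1} = x'_{i-1,i} & \hat{s}'_j - s'_{i-1} = x'_{i-1,j} \\
\hat{s}'_i - s'_{i+1} = x'_{i+1,i} & \hat{s}'_j - s'_{i+1} = x'_{i+1,j} \\
\quad\vdots & \quad\vdots \\
\hat{s}'_i - s'_{j-1} = x'_{j-1,i} & \hat{s}'_j - s'_{j-1} = x'_{j-1,j} \\
\hat{s}'_i - s'_{j+1} = x'_{j+1,i} & \hat{s}'_j - s'_{j+1} = x'_{j+1,j} \\
\quad\vdots & \quad\vdots \\
\hat{s}'_i - s'_k = x'_{k,i} & \hat{s}'_j - s'_k = x'_{k,j}
\end{array}
$$

(k-2 equations in each set)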
With infallible data, such as that found in the five-stimulus
example in Table II.1 above, all k-2 equations render exactly the
same value for the adjusted scale estimate. However, since
paired comparisons data rarely offer perfect observations, we
again choose to use a least squares approach to obtain values for
$\hat{s}'_i$ and $\hat{s}'_j$.

Our results above show that the mean of the k-2
equations gives the least squares solution:

    $\hat{s}'_i = \frac{1}{k-2} \sum_{m=1,\; m \neq i,j}^{k} (x'_{mi} + s'_m)$

    $\hat{s}'_j = \frac{1}{k-2} \sum_{m=1,\; m \neq i,j}^{k} (x'_{mj} + s'_m)$

Rearranging terms for $\hat{s}'_i$ gives:

    $\hat{s}'_i = \frac{1}{k-2} \sum_{m=1,\; m \neq i,j}^{k} x'_{mi} + \frac{1}{k-2} \sum_{m=1,\; m \neq i,j}^{k} s'_m$

Because the scale origin has been arbitrarily centered at the
mean of the scale estimates, the equation above becomes:

    $\hat{s}'_i = \frac{1}{k-2} \sum_{m=1,\; m \neq i,j}^{k} x'_{mi} + \frac{1}{k-2} (-s'_i - s'_j)$

Similarly, because $s'_i$ is equal to the mean of column i of X',
the equation above becomes:

    $\hat{s}'_i = \frac{1}{k-2} (k s'_i - x'_{ji}) + \frac{1}{k-2} (-s'_i - s'_j)$

With some rearrangement of terms, the adjusted scale estimates
$\hat{s}'_i$ and $\hat{s}'_j$ can be written:

(9)     $\hat{s}'_i = \frac{k-1}{k-2} s'_i - \frac{x'_{ji} + s'_j}{k-2}$

(10)    $\hat{s}'_j = \frac{k-1}{k-2} s'_j - \frac{x'_{ij} + s'_i}{k-2}$


Examination of (9) and (10) reveals that
$(\hat{s}'_i - s'_i) = -(\hat{s}'_j - s'_j)$. Thus, the adjusted estimates
satisfy the symmetry exhibited in the contrived five-stimulus
example, where the scale estimates for stimuli a and c moved the
same distance in opposite directions from their true scale value.
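A short numerical check of the adjustment, using the contrived five-stimulus data, may help. The sketch assumes the closed form $\hat{s}'_i = ((k-1) s'_i - s'_j - x'_{ji})/(k-2)$ obtained from the derivation above; the function name `adjusted_estimates` and the variable names are hypothetical.

```python
def adjusted_estimates(X, s, i, j):
    """Adjusted estimates for stimuli i and j when pair i,j is discounted.

    X[m][n] holds x'_mn and s holds the least squares estimates
    (column means of X', centered at zero).
    """
    k = len(s)
    s_i_adj = ((k - 1) * s[i] - s[j] - X[j][i]) / (k - 2)
    s_j_adj = ((k - 1) * s[j] - s[i] - X[i][j]) / (k - 2)
    return s_i_adj, s_j_adj


# Matrix X' of the contrived example (stimuli a, b, c, d, e).
X_obs = [
    [0.00, -0.10, 0.10, 0.15, -0.20],
    [0.10, 0.00, -0.05, 0.25, -0.10],
    [-0.10, 0.05, 0.00, 0.30, -0.05],
    [-0.15, -0.25, -0.30, 0.00, -0.35],
    [0.20, 0.10, 0.05, 0.35, 0.00],
]
k = len(X_obs)
s_est = [sum(X_obs[m][j] for m in range(k)) / k for j in range(k)]

# Discount the pair a,c (indices 0 and 2).
a_adj, c_adj = adjusted_estimates(X_obs, s_est, 0, 2)
shift_a = a_adj - s_est[0]
shift_c = c_adj - s_est[2]
```

The two shifts are equal in size and opposite in sign, and each adjusted value is .05 away from the corresponding least squares estimate, in line with the displacement described for stimuli a and c.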

The adjusted scale estimates now uniquely determine new
values for $x''_{ij}$ and $p''_{ij}$. To assess the change in fit associated
with adjusting the scale estimates for stimuli i and j, we form
the adjusted matrices X'' and P'', denoted X''(i,j) and P''(i,j), and
use Mosteller's chi-square test with ((k-1)(k-2)/2 - 1) degrees
of freedom.

We applied the diagnostic procedure to the contrived
five-stimulus example from Table II.1. For each stimulus pair
i,j, we adjusted the values of $s'_i$ and $s'_j$ and calculated the
change in fit. The results showed little or no improvement in
fit for all but the stimulus pair a,c. For that pair, the
adjusted scale estimates for a and c equaled the true scale
values for these stimuli, eliminating the source of
intransitivity in the otherwise errorless model and indicating a
complete improvement in fit.
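The whole diagnostic can be sketched as a loop over stimulus pairs. This is an illustration under several assumptions that the text does not fix: the sample size n = 100 is made up, the discounted cell is left out of the adjusted chi-square sum (one reading of the loss of one degree of freedom), and all function names are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # unit normal distribution function


def theta(x):
    """Arcsine transform, in degrees, of the proportion implied by deviate x."""
    return math.degrees(math.asin(math.sqrt(PHI(x))))


def chi_square(X_obs, s, n, skip=None):
    """Mosteller-type statistic over the lower triangle, optionally omitting one pair."""
    total = 0.0
    for i in range(len(s)):
        for j in range(i):
            if skip is not None and {i, j} == set(skip):
                continue  # assumed: the discounted cell is not scored
            total += (theta(s[j] - s[i]) - theta(X_obs[i][j])) ** 2 / (821.0 / n)
    return total


def sensitivity(X_obs, n):
    """Return the baseline statistic and the improvement for each discounted pair."""
    k = len(X_obs)
    s = [sum(X_obs[m][j] for m in range(k)) / k for j in range(k)]
    base = chi_square(X_obs, s, n)
    improvement = {}
    for i in range(k):
        for j in range(i):
            adj = list(s)
            adj[i] = ((k - 1) * s[i] - s[j] - X_obs[j][i]) / (k - 2)
            adj[j] = ((k - 1) * s[j] - s[i] - X_obs[i][j]) / (k - 2)
            improvement[(i, j)] = base - chi_square(X_obs, adj, n, skip=(i, j))
    return base, improvement


X_obs = [
    [0.00, -0.10, 0.10, 0.15, -0.20],
    [0.10, 0.00, -0.05, 0.25, -0.10],
    [-0.10, 0.05, 0.00, 0.30, -0.05],
    [-0.15, -0.25, -0.30, 0.00, -0.35],
    [0.20, 0.10, 0.05, 0.35, 0.00],
]
base, improvement = sensitivity(X_obs, n=100)
most_influential = max(improvement, key=improvement.get)  # (row, column) indices
```

On these data the pair with the largest improvement is (2, 0), that is, stimuli c and a, and discounting it removes essentially all of the baseline lack of fit.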

In general, the diagnostic serves to identify sources
of limited inconsistency or intransitivity in the paired
comparisons data. If the data are widely inconsistent, then
several of the scale estimates are liable to depart significantly
from the true scale values. Using the diagnostic to adjust the
scale estimates for a single stimulus pair might eliminate the
inconsistency introduced by that particular observation, but the
adjusted estimates would still reflect the inconsistencies that
influenced the positioning of the other stimuli. Such widespread
inconsistency, while not readily detectable by the diagnostic,
usually shows up in a poor overall goodness-of-fit, indicating a
departure from the assumptions made for the one dimensional
scaling model.

Kendall and Babington Smith's coefficient of
consistence is a valuable tool for identifying failure by a
single observer to adequately discriminate between stimuli.
However, in Thurstone's case II of the law of comparative
judgment, where the responses of several judges are used to
determine the observed proportions $p'_{ij}$, all of the observers may
give completely consistent responses, and yet the composite
comparisons may contain inconsistency or even complete
intransitivity, as shown in the example in Figure II.4. Our
diagnostic functions as a computationally inexpensive indicator
of limited inconsistency that is potentially damaging to the
interval properties of the scale.


            Judges 1,4    Judges 2,5    Judges 3,6    Composite

            A>B           C>A           B>C           A>B  (66%)
            B>C           A>B           C>A           B>C  (66%)
            A>C           C>B           B>A           C>A  (66%)

FIGURE II.4

An example demonstrating that the judgments of perfectly consistent
observers may yield a perfectly intransitive composite ordering.

III.   The T1 Scaling Solution Procedure


Our concern for obtaining scale estimates that are
relatively insensitive to the presence of limited error or
intransitivity in the observed data motivates the development of
a more robust scaling procedure. The weakness of the
Thurstonian judgment scaling model in the presence of limited
inconsistency is that it is based on an L2 linear approximation
of the underlying true values. In this least squares approach,
outlying observations tend to have an inordinate amount of
influence in the determination of the scale estimates. Because
least squares "prefers" a solution with several small
discrepancies to one with a single large error, the Thurstonian
procedure propagates limited inconsistency throughout the scale
estimates, degrading the interval properties of the entire scale.

Barrodale and Roberts (1973) suggest that when the data
contain inaccuracies or inconsistencies, an L1 approximation,
minimizing the sum of the absolute deviations, is often superior
to the best L2 approximation for estimating the true parameters
of the model. Thus, the L1 approach to determining scale
estimates requires minimizing the quantity Q, where


\[
Q = \sum_{i=1}^{k} \sum_{j=1}^{k} \left| x'_{ij} - (s'_j - s'_i) \right| \tag{11}
\]

An example taken from simple regression, shown below in Figure
III.1, illustrates their point. In the example, the true
underlying values follow exactly a linear model. Only one of the
seven observations differs from its true value, but that
difference is quite substantial. The L2 approximation operates
to distribute this error across all seven points, and thus the
seventh observation has the effect of tilting down the slope of
the regression line to b' and raising the intercept to a'. The
L1 approximation is not so influenced by the seventh observation,
and recovers the true model parameters, a and b.
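
The regression example can be reproduced numerically. The sketch below is
only an illustration under stated assumptions: the data (true model
y = 1 + 2x on x = 1, ..., 7, with the seventh observation lowered by 6) are
hypothetical, and the L1 fit uses brute force over pairs of points, relying
on the fact that some optimal L1 line passes through at least two
observations. It is not the Barrodale and Roberts algorithm.

```python
from itertools import combinations


def l2_fit(xs, ys):
    """Ordinary least squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


def l1_fit(xs, ys):
    """Least absolute deviations line by brute force over pairs of points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        cost = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, intercept, slope)
    return best[1], best[2]


# Hypothetical data: true model y = 1 + 2x, seventh observation too low by 6.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [1 + 2 * x for x in xs]
ys[6] -= 6

a1, b1 = l1_fit(xs, ys)  # L1 estimates of intercept and slope
a2, b2 = l2_fit(xs, ys)  # L2 estimates of intercept and slope
print("L1:", a1, b1)
print("L2:", round(a2, 3), round(b2, 3))
```

With these data the L1 line coincides with the true model, while the least
squares line has a smaller slope and a larger intercept, which is the
behavior the example describes.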


[Figure III.1: plot of the observed and true values, showing the true
model y = a + bx, which coincides with the L1 approximation, and the L2
approximation y = a' + b'x.]

FIGURE III.1

Regression example demonstrating the relative
insensitivity of the L1 approximation to
outlying observations.

There are also weaknesses to the L1 approach that
should not be overlooked. The L1 procedure occasionally performs
very badly in the presence of an outlying observation, as shown
in Figure III.2. In this example, the seventh observation
deviates so substantially from its true value that a better fit
is obtained by approximating the model using only the first and
seventh points rather than using the first six. A second
weakness of the L1 technique with respect to the scaling problem
at hand is that it has far greater computational requirements
than simply using column means to estimate the scale values.
Therefore, it remains for us to show that the L1 approach applied
to Thurstonian scaling, henceforth denoted T1 scaling, can be
solved in a manner that is computationally convenient, and that
under reasonable assumptions regarding the behavior of the data
the T1 scale estimates are superior to those determined by least
squares.

[Figure III.2: plot of the observed and true values, showing the true
model y = a + bx and the L1 approximation y = a' + b'x.]

FIGURE III.2

Regression example demonstrating the weakness of
L1 approximation in the presence of a wildly
inaccurate observation.

We now show that solving the T1 scaling problem for k
scale estimates is equivalent to solving a capacitated network
flow problem for a complete network with k nodes (see Bradley,
Hax, and Magnanti for a description of this type of problem).
Because X' is skew symmetric, we may limit our analysis to the
lower triangle of the matrix, writing (11) as follows:


\[
Q = \sum_{i > j} \left| x'_{ij} - (s'_j - s'_i) \right| \tag{12}
\]


Using  standard  techniques,  the  minimization  problem 
described  above  may  be  written  as  a  linear  program: 

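\[
\begin{aligned}
\text{Minimize} \quad & \sum_{m=1}^{k(k-1)/2} (u_m + v_m) \\
\text{subject to} \quad & s'_j - s'_i + u_m - v_m = x'_{ij}
  \quad \text{for all } (i,j) \text{ such that } i > j, \quad m = 1, 2, \ldots, k(k-1)/2 \\
& u_m,\ v_m \ge 0, \quad s'_j \text{ unconstrained}
\end{aligned}
\tag{13}
\]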

Thus, for a scaling problem with k stimuli, there are k(k-1)/2
constraints and k^2 variables. Solving Thurstone's crime study
(Thurstone, 1927b), a relatively large scaling problem involving
19 stimuli, would require solving a linear program with 361
variables and 171 constraints, a significant computational task.

We rewrite the linear program below using matrix
notation:

\[
\begin{aligned}
\text{Maximize} \quad & -(\mathbf{1}u + \mathbf{1}v) \\
\text{subject to} \quad & A s' + I u - I v = x' \qquad : \pi \\
& u,\ v \ge 0, \quad s' \text{ unconstrained}
\end{aligned}
\tag{14}
\]


where 1 indicates a row vector of ones, and the objective
function has been multiplied by -1 in order to cast the problem
as a maximization. Because the origin for the scale estimates is
arbitrarily set, it is appropriate to leave the s'_j unconstrained
in sign.

We now take the dual of (14), and find that we can
exploit the special structure of the constraint matrix, A:


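\[
\begin{aligned}
\text{Minimize} \quad & \pi x' \\
\text{subject to} \quad & \pi A = 0 && : s' \\
& \pi \ge -\mathbf{1} && : u \\
& -\pi \ge -\mathbf{1} && : v
\end{aligned}
\tag{15}
\]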

The linear program in (15) above is a capacitated network flow
problem, as A is the appropriate matrix for a complete network
with k nodes. The vector π, constrained to the interval [-1,1],
is the vector of arc flows. The dual variables s' associated
with the equality constraints in (15) are the scale estimates.

Thus, it is possible to reduce the T1 scaling problem
for k objects to a capacitated minimum cost network flow problem
of k nodes. Thurstone's crime study, mentioned above, would
require only 171 variables and 19 constraints, which is not
considered a very large network problem. Using widely
accepted network packages, such as GNET (Bradley, Brown, and
Graves, 1975), which exploit the special structure of the network
basis, the problem can be solved quickly and efficiently.
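
The problem sizes quoted for the crime study can be checked by building the
constraint matrix explicitly. This is a minimal sketch: the function name
incidence_matrix is hypothetical, and the sign convention (+1 in column j,
-1 in column i for the arc (i,j) with i > j) is an assumption read off the
constraints of (13).

```python
def incidence_matrix(k):
    """Build A for a complete network on k nodes.

    Rows are arcs (i, j) with i > j; columns are nodes (stimuli).
    Row (i, j) has +1 in column j and -1 in column i, so that the
    product A s' gives s'_j - s'_i.
    """
    arcs = [(i, j) for i in range(k) for j in range(i)]
    rows = []
    for i, j in arcs:
        row = [0] * k
        row[j] = 1
        row[i] = -1
        rows.append(row)
    return arcs, rows


k = 19  # number of stimuli in the crime study
arcs, A = incidence_matrix(k)
n_arcs = len(A)

# Primal (13)/(14): one s' per stimulus plus one u and one v per arc.
primal_variables = k + 2 * n_arcs
primal_constraints = n_arcs

# Dual (15): one flow per arc, one conservation constraint per node
# (the bounds on the flows are handled as arc capacities).
dual_variables = n_arcs
dual_constraints = k

print(primal_variables, primal_constraints)
print(dual_variables, dual_constraints)
```

Every row of A contains exactly one +1 and one -1, which is what allows the
condition πA = 0 in (15) to be read as conservation of flow at each node.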

In order to get some idea of how well T1 scaling does
in recovering the true scale values of a model, we can appeal to
this network conceptualization. Figure III.3 below shows the
contrived five-stimulus example in network form. The sources and
sinks of network flow have been added so that we may refer to arc
flow as a non-negative quantity in the interval [0,2] instead of
[-1,1]. The costs on the arcs are the observed values x'_ij, and
the reduced costs are denoted x̄'_ij. For a more detailed
discussion of network flow problems, associated terminology, and
solution procedures, the reader should refer to the relevant
chapters in Bradley, Hax, and Magnanti or Shapiro.

[Figure III.3: the five nodes a, b, c, d, e arranged in a straight line,
annotated with the dual variables (scale estimates), the arc costs, and
the arc flows.]

FIGURE III.3

Network representation of the T1 scaling problem for the
contrived five-stimulus example of section II.

Using this conceptual framework, we can make several
statements about the performance of T1 scaling:

1.   With infallible data, the T1 procedure estimates the true
scale values exactly for any basic feasible solution to the
network flow problem.

This is seen easily by configuring the network in a straight
line (as shown in Figure III.3 above), positioning the nodes from
left to right in the order indicated by the true scale values for
the stimuli they represent. The result is a network with a cost
on the arc directed from node i to node j equal to the true
difference between the two scale values. Because the dual
variables (which are the scale estimates s') are determined from
the set of equations s'_j - s'_i = x'_ij for all arcs (i,j) in the
basis, any spanning tree solution yields the same set of dual
variables: always the true scale values, with arbitrary origin.
2.   With one inconsistent observation between stimuli i and j in
an otherwise infallible set of data, the T1 procedure always
recovers the true scale values.

We demonstrate this result by representing the
inconsistent observation as an arc cost between node i and node j
equal to the true value x'_ij plus some perturbation factor Δ.
Without loss of generality, let us choose our initial basic
feasible solution so that the arc directed from node i to node j
is non-basic and at its lower bound. If Δ is equal to zero,
then our data are infallible, and by statement 1 above, any
spanning tree solution renders the exact scale estimates. If Δ
becomes positive on arc (i,j), then we have no motivation to
change our present solution, and the dual variables, which are the
scale estimates, remain unaltered. If Δ becomes negative, then
we can reduce the cost of our present solution by using the arc
at some positive flow capacity.

Once arc (i,j) enters the basis, it reaches full
capacity and subsequently leaves the basis. Otherwise, at least
one of the dual variables s' will reflect the perturbation factor
Δ, and the reduced costs for the non-basic arcs will indicate
entry into the basis. When arc (i,j) becomes non-basic at its upper
bound, the remaining spanning tree includes only the arcs with
perfectly accurate observations. Hence, the dual variables are
again the exact scale values, and all reduced costs are zero
except for x̄'_ij, which is negative, with the arc at its upper bound.

3.   With a small number of inconsistent observations (<< k)
involving mutually independent stimulus pairs in an otherwise
infallible set of data, the T1 procedure recovers the true scale
values so long as it finds a basic feasible solution where:

a. all of the arcs representing the inaccurate observations
are non-basic.

b. all of the arcs representing the observations that are
higher than their true value are at their lower bound.

c. all of the arcs representing the observations that are
lower than their true value are at their upper bound.

This result follows from the line of analysis pursued
in statement 2 above. Clearly, so long as the method finds such
a spanning tree solution, all reduced costs for the non-basic
arcs representing inaccurate observations higher than their true
values will be positive, and reduced costs for the non-basic arcs
at their upper bound will be negative. The dual variables for
this solution will be determined from the costs of the basic
arcs, which are all accurate observations; thus, the true scale
values will be recovered.

Once the inaccuracies in the observations begin to
affect stimulus pairs with common elements (such as i,j and i,k),
it is difficult to determine how the error is affecting the
resulting scale estimates. Our diagnostic is unable to isolate
these instances of "overlapping" error, as it functions to adjust
only two scale estimates at a time. Because scaling problems
typically involve a rather small number of items anyway, more
than one or two serious inaccuracies indicates the possibility of
some violation of the assumptions of the scaling model.

It remains to be seen, through some sort of empirical
validation, just how often the T1 procedure does find the
"uncontaminated" spanning tree solution, and further
investigation in this area is indicated.

We applied the T1 procedure to the contrived
five-stimulus example. As anticipated, it recovered the true
scale values of the errorless model, except for a translation in
scale origin. The solution procedure for the five node minimum
cost network flow problem is shown in four steps in Figure III.4.
Initially, the arc representing the inconsistent observation
between stimuli a and c is non-basic. The associated cost is .15
+ Δ, where .15 is the coefficient for the errorless model, and
the perturbation factor Δ equals -.25. Step 1 indicates that
for Δ < 0, arc (a,c) should enter the basis. Once arc (a,c) has
entered the basis, still at its lower bound, several other arcs
become candidates to enter the basis, as shown in Step 2. Only
when (a,c) leaves the basis in Step 4 do we reach an optimal
(once again degenerate) solution. Mosteller's goodness-of-fit
test reveals a negligible difference between θ' and θ'' for all
stimulus pairs except a,c; the T1 solution confines 100% of the
error to the single stimulus pair previously identified by the
diagnostic as suspicious, and does so regardless of the magnitude
of the error factor Δ.
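
The behavior described for the five-stimulus example can be illustrated
with a small brute-force computation. The sketch below rests on several
assumptions: the true scale values are hypothetical (the paper's own values
are not reproduced here; only the a,c coefficient of .15 and Δ = -.25 are
taken from the text), values are stored in hundredths to keep the
arithmetic exact, and arcs are indexed with i < j, which is equivalent to
the lower triangle by skew symmetry. Instead of a network simplex code, it
enumerates every spanning tree, sets s'_j - s'_i = x'_ij on the tree arcs,
and keeps the tree with the smallest sum of absolute deviations. This is
workable only for very small k.

```python
from itertools import combinations

TRUE = [0, 10, 15, 40, 70]  # hypothetical true values for a..e, in hundredths
DELTA = -25                 # perturbation on pair (a, c), in hundredths


def observations(true, delta):
    """Errorless x'_ij = s_j - s_i for i < j, with pair (a, c) perturbed."""
    x = {(i, j): true[j] - true[i]
         for i, j in combinations(range(len(true)), 2)}
    x[0, 2] += delta
    return x


def tree_scale(tree, x, k):
    """Scale values implied by exact fit on the tree arcs (origin at node 0).

    Returns None when the k - 1 arcs do not reach every node.
    """
    s = {0: 0}
    grew = True
    while grew:
        grew = False
        for i, j in tree:
            if i in s and j not in s:
                s[j] = s[i] + x[i, j]
                grew = True
            elif j in s and i not in s:
                s[i] = s[j] - x[i, j]
                grew = True
    return [s[n] for n in range(k)] if len(s) == k else None


def t1_scale(x, k):
    """Minimize the sum of |x'_ij - (s_j - s_i)| over spanning tree solutions."""
    best_cost, best_s = None, None
    for tree in combinations(x, k - 1):
        s = tree_scale(tree, x, k)
        if s is None:
            continue
        cost = sum(abs(v - (s[j] - s[i])) for (i, j), v in x.items())
        if best_cost is None or cost < best_cost:
            best_cost, best_s = cost, s
    return best_s, best_cost


def least_squares(x, k):
    """Column means of the full skew-symmetric matrix X'."""
    full = [[0] * k for _ in range(k)]
    for (i, j), v in x.items():
        full[i][j] = v
        full[j][i] = -v
    return [sum(full[i][j] for i in range(k)) / k for j in range(k)]


x = observations(TRUE, DELTA)
t1_s, t1_cost = t1_scale(x, len(TRUE))
ls = least_squares(x, len(TRUE))
print("T1:", t1_s, "total absolute deviation:", t1_cost)
print("LS:", ls)
```

Under these assumed values the T1 estimates reproduce the true differences
and the whole deviation falls on the pair a,c, whereas the column means
shrink the a,c difference and pull b and c together.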

In summary, the T1 scaling procedure appears to be a
highly desirable alternative to Thurstone's least squares
approach. Although computationally more time consuming, the T1
scaling method can use existing network packages to solve very
large scaling problems (around 20 objects) in seconds. When the
observed data reflect exactly the true form of the model, T1
scaling and least squares estimate the scale values exactly.
When a serious inaccuracy is present in an otherwise accurate set
of data, the T1 procedure can still recover the true scale
values, whereas least squares cannot always do so. It is also
important to note that T1 scaling does not fall prey to the same
weakness that L1 regression does: no matter how inaccurate the
one bad observation is, T1 scaling still recovers the true scale
values of the model.

FIGURE III.4

Four steps of the minimum cost network flow solution
procedure for the five-stimulus example.

IV.   Results from Applying the T1 Procedure to Scaling Problems
from the Literature

We applied the sensitivity diagnostic and the T1
procedure to certain Thurstonian scaling problems from the
literature, in order to determine their collective effectiveness
in identifying and resolving potential trouble with real paired
comparisons data. The first case involves a subset of the 1948
American League baseball data presented by Mosteller (1951).
Each one of five teams (Cleveland, Boston, New York,
Washington, and Chicago) played 22 games against each of the
four others. The proportion of games each team won against each
other team, analogous to the proportion of times one team is
judged better than another, is shown in Chart IV.1.

The least squares scale estimates from the Thurstonian
scaling model (shown in Chart IV.1) reveal a rather disappointing
configuration. The scale is split by a wide, empty interval,
with Chicago and Washington lumped together at the low end of the
scale and Boston, Cleveland, and New York almost on top of one
another at the high end.

Kendall and Smith's coefficient of consistency for
these data is .80, indicating some element of intransitivity in
this collective ordering. The one circular triad in the data
occurs with New York, Cleveland, and Boston: New York won over
50% of the games it played against Cleveland, Cleveland won over
50% of its games against Boston, and yet Boston won over 50% of
its games against New York. A coefficient value of .80, however,
does not conclusively demonstrate the failure of the comparisons
method to systematically discriminate between teams; the
probability that one or fewer intransitive triads occur due to an
unsystematic (random) series of judgments is low, less than .24.
If it is not the entire method which is at fault, then the error
might be due to a problematic comparison between two teams that
is interfering with the process of determining scale estimates
for the "best" of the five teams.
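
The coefficient and the probability quoted above can be reproduced from a
count of circular triads. The sketch below assumes Kendall and Babington
Smith's formula, 1 - 24d/(n^3 - n) for odd n and 1 - 24d/(n^3 - 4n) for
even n, where d is the number of circular triads. Only the three results
among New York, Cleveland, and Boston are taken from the text; the other
seven win/loss directions are hypothetical placeholders chosen so that no
further circular triad arises, since Chart IV.1 is not reproduced here.

```python
from itertools import combinations, product


def count_circular_triads(beats, items):
    """Count triples whose pairwise preferences form a cycle."""
    d = 0
    for a, b, c in combinations(items, 3):
        cycle_one = beats[a, b] and beats[b, c] and beats[c, a]
        cycle_two = beats[b, a] and beats[c, b] and beats[a, c]
        if cycle_one or cycle_two:
            d += 1
    return d


def consistency(d, n):
    """Coefficient of consistency for d circular triads among n items."""
    denominator = n**3 - n if n % 2 else n**3 - 4 * n
    return 1 - 24 * d / denominator


def dominance(results):
    """Turn (winner, loser) pairs into a lookup of pairwise preferences."""
    beats = {}
    for winner, loser in results:
        beats[winner, loser] = True
        beats[loser, winner] = False
    return beats


def prob_at_most(d_max, n):
    """Share of all equally likely preference patterns with at most d_max triads."""
    nodes = range(n)
    pairs = list(combinations(nodes, 2))
    hits = 0
    for outcome in product([True, False], repeat=len(pairs)):
        beats = {}
        for (i, j), i_wins in zip(pairs, outcome):
            beats[i, j] = i_wins
            beats[j, i] = not i_wins
        if count_circular_triads(beats, nodes) <= d_max:
            hits += 1
    return hits / 2 ** len(pairs)


TEAMS = ["New York", "Cleveland", "Boston", "Washington", "Chicago"]
RESULTS = [
    # Series winners stated in the text.
    ("New York", "Cleveland"), ("Cleveland", "Boston"), ("Boston", "New York"),
    # Hypothetical directions for the remaining series.
    ("New York", "Washington"), ("New York", "Chicago"),
    ("Cleveland", "Washington"), ("Cleveland", "Chicago"),
    ("Boston", "Washington"), ("Boston", "Chicago"),
    ("Washington", "Chicago"),
]

d = count_circular_triads(dominance(RESULTS), TEAMS)
zeta = consistency(d, len(TEAMS))
p = prob_at_most(1, len(TEAMS))
print(d, zeta, p)
```

The enumeration covers all 1,024 possible preference patterns among five
items, so the probability is exact for a purely random series of judgments.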

The sensitivity diagnostic reveals that almost 30% of
the discrepancy in fit is eliminated if the scale estimates for
New York and Boston are adjusted. One possible reason for this
error is that some exogenous factor operated on the series of
games between these two teams to produce results inconsistent
with the other series. During the era of the Yankees' supremacy
in the American League, it was often said: "The New York Yankees
are the champions of the world, but the Red Sox are champions of
the Yankees," because the Red Sox seemed able to beat the Yankees
fairly consistently, even though the Yankees at that time had the
best all-around record in baseball. On the assumption that this
discrepancy might have been due to home field conditions,
"rivalry," or a variety of other exogenous conditions not common
to the series played between the other teams, we applied the T1
scaling procedure and assessed the ultimate effect on the fit of
the model.

The T1 scale estimates shown in Chart IV.1 reflect a
noticeably different configuration. The scale positions for New
York and Boston, which were almost identical in the least squares
solution, are now widely separated, clearly identifying New York
as the "best" team of the five. These results are quite similar
to the five-stimulus example of section II. The T1 procedure
produces scale estimates that largely ignore the single
substantial source of inconsistency in the model. In the
baseball data, where one series of games should not have
inordinate influence on the determination of the "best" team, the
T1 procedure provides a solution that is informative and
conceptually appealing.

The second example, a scaling of attitude statements on
the participation of the United States in the Korean War, is
taken from Hill (1953). Hill selected a subset of seven
prescaled attitude statements that he deliberately biased toward
the favorable side. His hypothesis was that where statements
were conceptually closer on the "favorable/unfavorable"
continuum, there would be more inconsistency reflected in the
observational data. Kendall and Smith's coefficient equals
unity, indicating the absence of any circular triad. Clearly,
however, there is some form of inconsistency affecting the data.
Mosteller's fit criterion for the least squares estimates is poor
for n = 94 comparisons for each stimulus pair.

The sensitivity diagnostic reveals that no single
stimulus pair reduces the discrepancy in fit by more than 25%;
using the diagnostic to adjust any one of 16 out of 21 pairs in
the lower triangle of X' does not reduce the error by more than
10%. Thus, the sensitivity procedure does not identify any
source of limited intransitivity in the data.

The T1 scale estimates are largely similar to those
given by the least squares approach, indicating that when the
source of inconsistency in the data is not limited or well
defined, the T1 procedure does at least as well in estimating the
true scale values as the Thurstonian least squares. The results
are presented in Chart IV.2 below.

In conclusion, we can use the sensitivity diagnostic
presented above to determine where problems with inconsistency
appear in the observed data, how much the occurrence of
inconsistency degrades the fit of the model, and the nature of
the distortion of the scale estimates. This sensitivity analysis
involves discounting a single stimulus pair at a time, and the
method is straightforward and computationally simple. The
results provide a better idea of problems within the data and
indicate when there is a need for more robust scale estimates
that discount these data problems.

These more robust estimates may be obtained by solving
the minimum cost network flow problem outlined above as T1
scaling. The procedure provides scale estimates that are not
inordinately influenced by the presence of limited sources of
inaccuracy in the data.

1.  I suppose the U.S. has no choice but to continue the Korean War.

2.  We should be willing to give our allies in Korea more money if they need it.

3.  Withdrawing our troops from Korea at this time would only make matters worse.

4.  The Korean War might not be the best war to stop communism, but it was the
only thing we could do.

5.  Winning the Korean War is absolutely necessary whatever the cost.

6.  We are protecting the United States by fighting in Korea.

7.  The reason we are in Korea is to defend freedom.